Data analysis and statistical modelling  

Prerequisites
Probability and Statistics.

Objectives
- Introduction to Applied Statistics and its relevance to Data Science.
- Analyze real data using statistical methods to extract relevant information and solve practical problems using statistical software.
- Know the advantages and limitations of various statistical methodologies in order to make the most of them in solving real problems.
- Find statistical evidence in the data based on models fitted to the observations collected. Draw inferences about hypotheses of interest associated with the selected models.
- Solve a real problem using the knowledge accumulated in this course: computational project.

Program
1. Exploratory data analysis: (i) Introduction to R. (ii) Visualization of different types of data. (iii) Treatment of missing values. (iv) Outlier detection.
2. Dimensionality reduction: principal component analysis. Covariance and correlation matrices.
3. Regression models: Gaussian, Logistic, Poisson. Variable selection. Diagnostic techniques. Model validation. Prediction.
4. Modeling independent data versus time-dependent data.
5. Resampling methods: jackknife, bootstrap, permutation testing and cross-validation.
6. Elements of the Bayesian methodology: prior representation (conjugate and non-informative distributions), inference by Bayes' theorem and applications to real data problems.
7. Classification: total probability of misclassification, Fisher linear discriminant analysis, Bayes classification rule. Evaluation of the performance of a classification rule.

Evaluation Methodology
A test of 1h30m (50%), with a minimum grade of 8.0, and a computational project (50%).

Cross-Competence Component
- Critical and Innovative Thinking: the project involves strategic thinking, critical thinking, creativity, and problem-solving strategies, without explicit evaluation.
- Intrapersonal Competencies: the project involves productivity and time management, stress management, proactivity and initiative, intrinsic motivation and decision making, without explicit evaluation.
- Interpersonal Skills: in assessing the project, 10% of the grade is given to the form of the report and 10% to the oral presentation and discussion of the project.

Laboratory Component
Laboratory work is performed with the help of R (or equivalent).

Programming and Computing Component
The laboratory and project work involve R programming. The evaluation percentage in this component is 50%.

More information at: https://fenix.tecnico.ulisboa.pt/cursos/lerc/disciplina-curricular/845953938490004
In person
English

Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or HaDEA. Neither the European Union nor the granting authority can be held responsible for them. The statements made herein do not necessarily have the consent or agreement of the ASTRAIOS Consortium. These represent the opinion and findings of the author(s).